Abstract

We consider a coordination game between an informed sender and an uninformed 
decision maker, the receiver, who communicate over a noisy channel. The sender's 
strategy, called a code, maps states of nature to signals. The receiver's best response 
is to decode the received channel output as the state with highest expected receiver 
payoff. Given this decoding, an equilibrium or "Nash code" results if the sender en- 
codes every state as prescribed. We show two theorems that give sufficient conditions 
for Nash codes. First, a receiver-optimal code defines a Nash code. A second, more 
surprising observation holds for communication over a binary channel which is used 
independently a number of times, a basic model of information transmission: Un- 
der a minimal "monotonicity" requirement for breaking ties when decoding, which 
holds generically, any code is a Nash code. 
1 Introduction



Many economic interactions involve information transmission, which is often modeled as 
a sender-receiver game between an informed expert and an uninformed decision maker. 
Information is not always transmitted faithfully. This "noise" may be strategic, as demon- 
strated in the many examples of signalling games due to conflicting incentives of sender 
and receiver (see Spence, 1973, and the surveys by Kreps and Sobel, 1994, and Sobel, 
2010). A different kind of noise is due to unintended communication errors, such as dis- 
torted signals or imprecisely worded or misunderstood messages. This noise is considered 
in information theory (Shannon, 1948) and in studies of language and ambiguity (Nowak 
and Krakauer, 1999). 

We study noisy information transmission as a sender-receiver game where the interests of 
sender and receiver coincide. One of finitely many states of nature is chosen at random. 
The sender is informed of the state and transmits a signal via a discrete noisy channel to an
uninformed receiver who makes a decision. 

The sender's strategy or code assigns to each state of nature a specific signal or "code- 
word" that is the input to the channel. The receiver's strategy decodes the distorted signal 
that is the channel output as one of the possible states. Both players receive a positive 
payoff only if the state is decoded correctly, otherwise payoff zero. 

In equilibrium, the receiver decodes the channel output as the state with highest expected 
payoff. This receiver condition is the well-known "maximum likelihood" decoding in the 
special case of uniform priors and equal utilities. The equilibrium condition for the sender 
means that she chooses for each state the prescribed codeword as her best response, that 
is, no other channel input has a higher probability of being decoded correctly with the 
given receiver strategy. 

A Nash code is a code together with a best-response decoding that defines a Nash equilib- 
rium. So we assume the straightforward equilibrium condition for the receiver and require 
that the code fulfills the more involved sender condition. (Of course, both conditions are 
necessary for equilibrium.) 

We present two main results about Nash codes. For arbitrary discrete channels, not every 
code defines a Nash equilibrium. However, a Nash code results if the expected payoff to 
the receiver cannot be increased by replacing a single codeword with another one
(Theorem 5). So these "receiver-optimal" codes are Nash codes. This is closely related to
potential games and provides a method to construct Nash codes (Proposition 6).

Our second, more surprising and technically challenging result concerns the binary chan- 
nel where codewords are strings of bits with independent error probabilities for each bit, 
a fundamental model in information transmission. Then any code is a Nash code
(Theorem 8). The only requirement for the decoding is that the receiver breaks ties between
states in a consistent manner; this holds for natural tie-breaking rules, and ties do not even 
occur if states of nature have different generic prior probabilities or utilities. So binary 
codes, as Nash codes, are very suitable for information transmission because the agents 
never have an incentive to deviate from them. 

Sender-receiver games studied in the literature typically assume communication without
transmission errors. In their seminal paper, Crawford and Sobel (1982) study such a 
game where, unlike in our games, the interests of sender and receiver do not coincide. In 
equilibrium, the sender only reveals partial information about the state, which can be seen 
as noise being introduced strategically. 

Even in rather simple sender-receiver games, players can get higher equilibrium payoffs 
when communicating over a channel with noise than with perfect communication (Myer- 
son, 1994, Section 4). Blume, Board, and Kawamura (2007) extend the model by Craw- 
ford and Sobel (1982) by assuming communication errors. The noise allows for equilibria 
that improve welfare compared to the Crawford-Sobel model. The construction partly de- 
pends on the specific form of the errors so that erroneous transmissions can be identified; 
this does not apply in our discrete model. In addition, in our model players only get posi- 
tive payoff when the receiver decodes the state correctly, unlike in the continuous models 
by Crawford and Sobel (1982) and Blume et al. (2007). On the other hand, compared to 
perfect communication, noise may prevent players from achieving common knowledge 
about the state of nature (Koessler, 2001). 

Game-theoretic models of communication have been used in the study of language. Lewis 
(1969) describes language as a "convention" with mappings between states and signals, 
and argues that these should be bijections. Nowak and Krakauer (1999) use evolutionary
game theory to show how languages may evolve from "noisy" mappings; Wärneryd
(2003) shows that only bijections are evolutionarily stable. However, even ambiguous
sender mappings (where one signal is used for more than one state) together with a mixed 
receiver population may be "neutrally stable" (Pawlowitsch, 2008); the randomized re- 
ceiver strategy can be seen as noise. Blume and Board (2009) use the noisy channel to 
model vagueness in communication. Lipman (2009) discusses how vagueness can arise 
even for coinciding interests of sender and receiver. Ambiguous signals arise when the 
set of messages is smaller than the set of states, which may reflect communication costs for
the sender (Jager, Koch-Metzger, and Riedel, 2011). For the sender-receiver game with a
noisy binary channel, Hernandez, Urbano, and Vila (2010a) describe the equilibria for a
specific code that can serve as a "universal grammar"; the explicit receiver strategy makes
it possible to characterize the equilibrium payoff.

Noise in communication is relevant to models of persuasion, where the sender wants to 
induce the receiver to take an action. Glazer and Rubinstein (2004; 2006) study binary 
receiver actions; the sender may reveal limited information about the state of nature as
"evidence". The optimal way to do so is a receiver-optimal mechanism. In a more general
setting, Kamenica and Gentzkow (2011) allow the sender to commit to a strategy that 
selects a message for each state, assuming the receiver's best response using Bayesian 
updating; the sender may generate noise by selecting the message at random. Subject to 
a certain Bayesian consistency requirement, the sender can commit to her best possible 
strategy. 

Section 2 describes our model and characterizes the Nash equilibrium condition. For
channels with any number of symbols, Section 3 gives an example showing that some codes
may not be Nash codes, shows that receiver-optimal codes are, and discusses the relation to
potential functions. In Section 4, we consider binary codes and state the main Theorem 8,
which is proved in the Appendix. It requires the condition of "monotonic" decoding when
ties occur, for example in a fixed order among the states as when they have generic priors.
A later section shows that this is in fact the only general deterministic monotonic
tie-breaking rule.



2 Nash codes 

We consider a game of two players, a sender (she) and a receiver (he). First, nature
chooses a state $i$ from a set $\Omega = \{0, 1, \ldots, M-1\}$ with positive prior probability $q_i$. Then
the sender is fully informed about $i$, and sends a message to the receiver via a noisy
channel. After receiving the message as output by the channel, the receiver takes an
action that affects the payoff of both players.

The channel has finite sets $X$ and $Y$ of input and output symbols, with noise given by
transition probabilities $p(y|x)$ for each $x \in X$, $y \in Y$. The channel is used $n$ times
independently without feedback. When an input $x = (x_1, \ldots, x_n)$ is transmitted through the
channel, it is altered to an output $y = (y_1, \ldots, y_n)$ according to the probability $p(y|x)$ given
by

\[ p(y|x) = \prod_{j=1}^{n} p(y_j|x_j). \tag{1} \]

This is the standard model of a memoryless noisy channel as considered in information 
theory (Cover and Thomas, 1991; MacKay, 2003). 
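
As a minimal sketch of the product formula (1), the following Python function multiplies the single-use transition probabilities along a sequence. The function name `sequence_probability`, the nested-dictionary layout `p[a][b]` for $p(b|a)$, and the channel `bsc` with error probability 0.1 are illustrative assumptions, not part of the model description.

```python
from math import prod

def sequence_probability(p, x, y):
    """Probability p(y|x) of output sequence y given input sequence x, as in (1).

    p[a][b] holds the single-use transition probability p(b|a).
    """
    if len(x) != len(y):
        raise ValueError("input and output must have the same length n")
    return prod(p[a][b] for a, b in zip(x, y))

# Hypothetical binary channel with error probability 0.1 in both directions.
bsc = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.1, 1: 0.9}}

# Input 100 received as 110: two bits arrive correctly, one is flipped.
print(sequence_probability(bsc, (1, 0, 0), (1, 1, 0)))
```

Because the channel is memoryless, the probability depends only on the symbol pairs at each position, which is why a single pass over `zip(x, y)` suffices.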

The sender's strategy is to encode state $i$ by means of a coding function or code $c : \Omega \to X^n$,
which we write as $c(i) = x^i$. We call $x^i$ the codeword or message for state $i$ in $\Omega$, which
the sender transmits as input to the channel. The code $c$ is completely specified by the list
of $M$ codewords $x^0, x^1, \ldots, x^{M-1}$, which is called the codebook.

The receiver's strategy is to decode the channel output given by a probabilistic decoding
function

\[ d : Y^n \times \Omega \to \mathbb{R}, \tag{2} \]

where $d(y, i)$ is the probability that $y$ is decoded as $i$.

Sender and receiver have the common interest that the message is decoded correctly. That
is, if the receiver decodes the channel output as the state $i$ chosen by nature, then sender
and receiver get positive payoff $U_i$ and $V_i$, respectively, otherwise both get payoff zero.
The channel transition probabilities, the transmission length $n$, and the prior probabilities
$q_i$ and utilities $U_i$ and $V_i$ for $i$ in $\Omega$ are commonly known to the players.

We are interested in conditions so that the pair $(c, d)$ defines a Nash equilibrium. In that
case, we call $c$, under the assumption that decoding takes place according to $d$, a Nash
code. We denote the expected payoffs to sender and receiver by $U(c, d)$ and $V(c, d)$,
respectively.

The code $c$ defines the sender's strategy. The best response of the receiver is the following.
Given that he receives channel output $y$ in $Y^n$, the probability that codeword $x^i$ has been
sent is, by Bayes's law, $q_i\, p(y|x^i)/\mathrm{prob}(y)$, where $\mathrm{prob}(y)$ is the overall probability that
$y$ has been received. The factor $1/\mathrm{prob}(y)$ can be disregarded in the maximization of
the receiver's expected payoff. Hence, a best response of the receiver is to choose with
positive probability $d(y, i)$ only states $i$ so that $q_i V_i\, p(y|x^i)$ is maximal, that is, so that $y$
belongs to the set $Y_i$ defined by

\[ Y_i = \{\, y \in Y^n \mid q_i V_i\, p(y|x^i) \ge q_k V_k\, p(y|x^k) \ \ \forall k \in \Omega \,\}. \tag{3} \]

Hence, the best response condition for the receiver states that for any $y \in Y^n$ and $i \in \Omega$,

\[ d(y, i) > 0 \ \Rightarrow\ y \in Y_i. \tag{4} \]

If $V_i = 1$ for all $i \in \Omega$, then $Y_i$ in (3) is the set of channel outputs $y$ so that the channel
input $x^i$ has maximum likelihood. (This term is sometimes used only for uniform prior
probabilities, e.g. MacKay, 2003, p. 152, which we do not assume.) If the receiver has
different positive utilities $V_i$ for different states $i$, then the receiver's best response
maximizes $q_i V_i\, p(y|x^i)$.

We say that for a given channel output $y$, there is a tie between two states $i$ and $k$ (or
the states are tied) if $y \in Y_i \cap Y_k$. If there are never any ties, then the sets $Y_i$ for $i \in \Omega$
are pairwise disjoint, and the best-response decoding function is deterministic and unique
according to (4).

We refer to the sets $Y_i$ for $i \in \Omega$ as a "partition" of $Y^n$, which constrains the receiver's
best-response decoding as in (4), even though some of these sets may be empty, and they
may not always be disjoint if there are ties. In any case, $Y^n = \bigcup_{i \in \Omega} Y_i$.

Suppose that the receiver decodes the channel output with $d$ according to (3) and (4) for
the given code $c$ with $c(i) = x^i$. Then $(c, d)$ is a Nash equilibrium if and only if, for any
state $i$, it is optimal for the sender to transmit $x^i$ and not any other $x$ in $X^n$ as a message.
When sending $x$, the expected payoff to the sender in state $i$ is

\[ U_i \sum_{y \in Y^n} p(y|x)\, d(y, i). \tag{5} \]

When maximizing (5), the utility $U_i$ to the sender does not matter as long as it is positive;
given that the state is $i$, the sender only cares about the probability that the channel output $y$
is decoded as $i$. We summarize these observations as follows.



Proposition 1 The code $c$ with decoding function $d$ is a Nash code if and only if the
receiver decodes channel outputs according to (3) and (4), and in every
state $i$ the sender transmits codeword $c(i) = x^i$ which fulfills for any other possible channel
input $x$ in $X^n$

\[ \sum_{y \in Y^n} p(y|x^i)\, d(y, i) \ \ge\ \sum_{y \in Y^n} p(y|x)\, d(y, i). \tag{6} \]

3 Receiver-optimal codes



In this section, we first ask whether every code is a Nash code, assuming that the receiver 
chooses a best response. We give a detailed example that demonstrates that this may not 
be the case, and that we use throughout the section. Then we show that every code that 
maximizes the receiver's payoff is a Nash code. The proof implies that this holds also if 
the receiver's payoff is locally maximal, that is, when changing only a single codeword, 
and the corresponding best response of the receiver, at a time. Finally, we discuss the 
connection with potential functions. 

Consider a channel with three symbols, $X = Y = \{0, 1, 2\}$, which is used only once ($n = 1$),
with the following transition probabilities:

    p(y|x)    y = 0   y = 1   y = 2
    x = 0     0.85    0.10    0.05
    x = 1     0.10    0.65    0.25          (7)
    x = 2     0.00    0.30    0.70


Suppose that nature chooses the two states in $\{0, 1\}$ with uniform priors $q_0 = q_1 = 1/2$.
The sender's utilities are $U_0 = 2$ when the state is 0 and $U_1 = 8$ when the state is 1, and
the receiver's utilities are $V_0 = 8$, $V_1 = 2$.

Consider the codebook $c$ with $c(0) = x^0 = 0$ and $c(1) = x^1 = 1$, so the sender encodes the
two states of nature as the two symbols 0 and 1, respectively. Given the parameters of this
game and the sender's strategy $c$, the receiver's strategy assigns to each output symbol in
$\{0, 1, 2\}$ one state. The following table (8) gives the expected payoff $q_i V_i\, p(y|x^i)$ for the
receiver when the state is $i$ and the output symbol is $y$.

    q_i V_i p(y|x^i)    y = 0   y = 1   y = 2
    i = 0               3.4     0.4     0.2          (8)
    i = 1               0.1     0.65    0.25



Table (8) allows us to compute the receiver's best response and the sets $Y_i$ in (3). For each
channel output $y$, the receiver chooses the state $i$ with highest expected payoff. Hence, he
decodes the channel output 0 as state 0 because $q_0 V_0\, p(0|x^0) = 3.4 > 0.1 = q_1 V_1\, p(0|x^1)$.
In the same way, he decodes both channel outputs 1 and 2 as state 1. Notice that there are
no ties, so the two sets $Y_0$ and $Y_1$ are disjoint, and the receiver's best response is unique
and deterministic. That is, the receiver's best response $d$ is given by $d(y, i) = 1$ if and only
if $y \in Y_i$, where $Y_0 = \{0\}$ and $Y_1 = \{1, 2\}$.
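
The best-response sets can be recomputed mechanically from the channel (7), the priors, and the receiver's utilities. The sketch below is one way to do this for $n = 1$; the names `P`, `q`, `V`, and `best_response_sets` are chosen here for illustration, and exact fractions are used so that ties, if any, would be detected exactly.

```python
from fractions import Fraction as F

# Channel (7): P[x][y] = p(y|x) for a single use of the channel.
P = [[F(85, 100), F(10, 100), F(5, 100)],
     [F(10, 100), F(65, 100), F(25, 100)],
     [F(0),       F(30, 100), F(70, 100)]]
q = [F(1, 2), F(1, 2)]   # priors q_0, q_1
V = [F(8), F(2)]         # receiver utilities V_0, V_1

def best_response_sets(codebook):
    """Sets Y_i of (3) for n = 1: outputs y where state i maximizes q_i V_i p(y|x^i)."""
    sets = [set() for _ in codebook]
    for y in range(len(P[0])):
        weights = [q[i] * V[i] * P[x][y] for i, x in enumerate(codebook)]
        best = max(weights)
        for i, w in enumerate(weights):
            if w == best:          # a tie would put y into more than one set
                sets[i].add(y)
    return sets

print(best_response_sets([0, 1]))
```

For the codebook 0, 1 this reproduces $Y_0 = \{0\}$ and $Y_1 = \{1, 2\}$.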

Is this code $c$ given by the codebook $x^0, x^1 = 0, 1$ a Nash code?

Given the partition of $Y$ into $Y_0$ and $Y_1$ by the receiver strategy $d$, it is easy to compute
the sender payoff as in (5) when the states 0 and 1 are realized. For the first state 0,
her payoff is $U_0 \sum_{y \in Y} p(y|0)\, d(y, 0) = U_0 \sum_{y \in Y_0} p(y|0) = 2 \times p(0|0) = 1.7$. For the second
state 1, her payoff is $U_1 \sum_{y \in Y} p(y|1)\, d(y, 1) = U_1 \sum_{y \in Y_1} p(y|1) = 8 \times (p(1|1) + p(2|1)) =
8 \times (0.65 + 0.25) = 7.2$. The sender's (ex-ante) expected payoff is therefore $U(c, d) =
q_0 \cdot 1.7 + q_1 \cdot 7.2 = 4.45$.

In order to check the Nash equilibrium property of $(c, d)$, there should be no code $c'$ so
that $U(c', d) > U(c, d)$. Consider now the new sender strategy $c'$ with codebook 0, 2,
which differs from code $c$ in the codeword $c'(1) = 2$ for state 1. The receiver's strategy
$d$ with $Y_0$ and $Y_1$ is fixed. State 0 is encoded by the same codeword $c(0) = c'(0) = 0$, so
the sender's payoff for that state is 1.7 as before. However, for state 1, the signal sent is
2 instead of 1. Then the sender's payoff is $U_1 \sum_{y \in Y_1} p(y|2) = 8 \times (p(1|2) + p(2|2)) =
8 \times (0.3 + 0.7) = 8$, which is higher than her payoff 7.2 when sending signal 1. Her
expected payoff increases to $U(c', d) = q_0 \cdot 1.7 + q_1 \cdot 8 = 4.85$. Consequently, the code $c$
with codebook 0, 1 is not a Nash code.
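
The deviation argument can be checked numerically. The following sketch keeps the receiver's decoding fixed at the best response to the codebook 0, 1 and evaluates the sender's ex-ante payoff for both codebooks; the helper name `sender_payoff` and the dictionary `decode` are illustrative choices.

```python
# Channel (7) and the sender's parameters from the example.
P = [[0.85, 0.10, 0.05],
     [0.10, 0.65, 0.25],
     [0.00, 0.30, 0.70]]
q = [0.5, 0.5]                # priors q_0, q_1
U = [2.0, 8.0]                # sender utilities U_0, U_1
decode = {0: 0, 1: 1, 2: 1}   # receiver's best response to codebook 0,1

def sender_payoff(codebook):
    """Ex-ante sender payoff U(c, d) for the fixed decoding above."""
    total = 0.0
    for i, x in enumerate(codebook):
        # Probability that the codeword for state i is decoded as state i.
        p_correct = sum(P[x][y] for y in range(3) if decode[y] == i)
        total += q[i] * U[i] * p_correct
    return total

print(round(sender_payoff([0, 1]), 2))   # original code c
print(round(sender_payoff([0, 2]), 2))   # deviation c'
```

The second value exceeds the first, which is the profitable deviation described in the text.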

In this example, changing the codebook $c$ to $c'$ improves the sender payoff from $U(c, d)$
to $U(c', d)$, where $d$ is the receiver's best-response decoding for code $c$. In addition, it is
easily seen that the receiver payoff also improves from $V(c, d)$ to $V(c', d)$, and his payoff
$V(c', d')$ for the best response $d'$ to $c'$ is possibly even higher. This observation leads us
to a sufficient condition for Nash codes.

Definition 2 A receiver-optimal code is a code $c$ with highest expected payoff to the
receiver, that is, so that

\[ V(c, d) \ge V(\hat{c}, \hat{d}) \]

for any other code $\hat{c}$, where $d$ is a best response to $c$ and $\hat{d}$ is a best response to $\hat{c}$.

Note that in this definition, the expected payoff $V(c, d)$ (and similarly $V(\hat{c}, \hat{d})$) does not
depend on the particular best-response decoding function $d$ in case $d$ is not unique when
there are ties, because the receiver's payoff is the same for all best responses $d$.

The following is the central theorem of this section. It is proved in three simple steps,
which give rise to a generalization that we discuss afterwards, along with examples and
further observations.

Theorem 3 Every receiver-optimal code is a Nash code. 

Proof. Let $c$ be a receiver-optimal code with codebook $x^0, x^1, \ldots, x^{M-1}$ and associated
best-response decoding $d$ according to (3) and (4). Suppose $c$ is not a Nash code. Then
there exists a code $\hat{c}$ with codebook $\hat{x}^0, \ldots, \hat{x}^{M-1}$ so that $U(\hat{c}, d) > U(c, d)$, that is,

\[ \sum_{i \in \Omega} q_i U_i \sum_{y \in Y^n} p(y|\hat{x}^i)\, d(y, i) \ >\ \sum_{i \in \Omega} q_i U_i \sum_{y \in Y^n} p(y|x^i)\, d(y, i). \tag{9} \]


^We are indebted to Drew Fudenberg who suggested steps two and three. 
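
Step one: Clearly, (9) implies that there exists at least one $i \in \Omega$ so that

\[ \sum_{y \in Y^n} p(y|\hat{x}^i)\, d(y, i) \ >\ \sum_{y \in Y^n} p(y|x^i)\, d(y, i). \tag{10} \]

Consider the new code $c'$ which coincides with $c$ except for the codeword for state $i$,
where we set $c'(i) = \hat{x}^i$. So the codebook for $c'$ is $x^0, \ldots, x^{i-1}, \hat{x}^i, x^{i+1}, \ldots, x^{M-1}$. By (10),
we also have

\[
\begin{aligned}
U(c', d) &= \sum_{j \in \Omega,\ j \ne i} q_j U_j \sum_{y \in Y^n} p(y|x^j)\, d(y, j) \;+\; q_i U_i \sum_{y \in Y^n} p(y|\hat{x}^i)\, d(y, i) \\
&> \sum_{j \in \Omega} q_j U_j \sum_{y \in Y^n} p(y|x^j)\, d(y, j) \;=\; U(c, d).
\end{aligned}
\tag{11}
\]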



Step two: In the same manner, (10) implies an improvement of the receiver payoff, that
is,

\[ V(c', d) > V(c, d). \tag{12} \]



Step three: Let $d'$ be the best response to $c'$, which with (12) implies

\[ V(c', d') \ge V(c', d) > V(c, d). \]

Hence, code c' has higher expected receiver payoff than c. This contradicts the assumption 
that c is a receiver-optimal code. □ 

The preceding theorem asserts that there is at least one Nash code. It can be found as a 
code with highest receiver payoff. 





    x^0,x^1   Y_0        Y_1      p(y in Y_0 | x^0)   p(y in Y_1 | x^1)   U      V
    0,1       {0}        {1,2}    0.85                0.90                4.45   4.30
    0,2       {0,1}      {2}      0.95                0.70                3.75   4.50
    1,0       {1,2}      {0}      0.90                0.85                4.30   4.45
    1,2       {0,1,2}    {}       1.00                0.00                1.00   4.00
    2,0       {1,2}      {0}      1.00                0.85                4.40   4.85
    2,1       {1,2}      {0}      1.00                0.10                1.40   4.10

                                                                                 (13)



For our example, the table in (13) lists the six possible codebooks $x^0, x^1$, shown in the
first column, that have distinct codewords ($x^0 \ne x^1$). For each code, the receiver's best
response is unique. The best-response partition $Y_0, Y_1$ is shown in the second and third
columns. Using this partition, the next two columns give the probabilities
$p(y \in Y_i \mid x^i) = \sum_{y \in Y_i} p(y|x^i)$ that the codeword $x^i$ is decoded correctly. The overall
expected payoffs to sender and receiver are shown as $U$ and $V$.
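
The payoff columns of table (13) can be reproduced with a short script. The sketch below is an illustration under the example's parameters; the names `decode`, `payoffs`, and `table` are assumptions introduced here, and the tie-breaking rule in `decode` (lowest state index) is never actually invoked because no ties occur in this example.

```python
from fractions import Fraction as F
from itertools import permutations

P = [[F(85, 100), F(10, 100), F(5, 100)],
     [F(10, 100), F(65, 100), F(25, 100)],
     [F(0),       F(30, 100), F(70, 100)]]   # channel (7)
q = [F(1, 2), F(1, 2)]
U = [F(2), F(8)]
V = [F(8), F(2)]

def decode(codebook):
    """Best response: map each output y to the state maximizing q_i V_i p(y|x^i)."""
    return [max(range(2), key=lambda i: q[i] * V[i] * P[codebook[i]][y])
            for y in range(3)]

def payoffs(codebook, d):
    """Expected payoffs (U, V) of a codebook under the decoding list d."""
    hit = [sum(P[codebook[i]][y] for y in range(3) if d[y] == i) for i in range(2)]
    return (sum(q[i] * U[i] * hit[i] for i in range(2)),
            sum(q[i] * V[i] * hit[i] for i in range(2)))

# One row per codebook with distinct codewords, as in (13).
table = {cb: payoffs(cb, decode(cb)) for cb in permutations(range(3), 2)}
for cb, (u, v) in table.items():
    print(cb, float(u), float(v))
```

Taking the maximum of the last column over all rows identifies the receiver-optimal codebook.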



^This claim follows also directly from Proposition 1, but we want to refer later to (9) as well.

According to the rightmost column in (13), the unique receiver-optimal codebook is 2,0,
which is a Nash code by Theorem 3. We have already shown that 0,1 is not a Nash code.
Note, however, that this is the code with highest sender payoff. Hence, a "sender-optimal"
code is not necessarily a Nash code.



It is also easily seen from (13) that 1,0 and 2,1 are not Nash codes, either: Both codebooks
have the same best-response partition $Y_0 = \{1, 2\}$ and $Y_1 = \{0\}$ as the codebook 2,0, but
have lower payoff to the sender, so the sender can profitably deviate from 1,0 or 2,1
to 2,0.



In (13), the codebook 1,2 has the interesting property that the receiver decodes any
channel output $y$ as state 0; this holds because even the unaltered codeword $x^1 = 2$, when
received as $y = 2$, fulfills $q_1 V_1\, p(2|x^1) = 1 \times 0.7 < 4 \times 0.25 = q_0 V_0\, p(2|x^0)$, so the
receiver prefers to decode it as state 0. So here $Y_0 = Y^n$ and $Y_1$ is the empty set. Given
that the receiver's action is the same for any received channel output, the sender cannot
improve her payoff by transmitting anything else. So the codebook 1,2 is a Nash code.

In fact, any sender-receiver game, irrespective of the players' payoffs, has a trivial
"pooling" equilibrium where the sender's signal does not depend on the state, and the
receiver's best response decodes the uninformative channel output as the state $i$ with highest
expected payoff, in our game $q_i V_i$. In our example, such codes have equal codewords, with
$x^0 = x^1$, all decoded as state 0; they are not listed in (13). The codebook 1,2 is potentially
informative, but the receiver ignores the channel output due to his utility function.



Finally, the code $c$ with codebook 0,2 in (13) is also a Nash code, which
is seen as follows. Let $d$ be the best response to $c$, with $Y_0 = \{0, 1\}$, $Y_1 = \{2\}$. As
shown in the proof of Theorem 3, if the sender could profitably deviate from $c$ to $\hat{c}$, then
she could also profitably deviate to a code $c'$ that differs from $c$ in one codeword only.
The possible codes $c'$ have codebooks 1,2, where $c(0)$ is changed to $c'(0) = 1$, and 0,1,
where $c(1)$ is changed to $c'(1) = 1$. In the first case, by (7), changing $c(0)$ from 0 to 1
changes $q_0 U_0 \sum_{y \in Y_0} p(y|0) = 0.85 + 0.1 = 0.95$ to $q_0 U_0 \sum_{y \in Y_0} p(y|1) = 0.1 + 0.65 = 0.75$,
which is not an improvement. In the second case, changing $c(1)$ from 2 to 1 changes
$q_1 U_1 \sum_{y \in Y_1} p(y|2) = 4 \times 0.70 = 2.8$ to $q_1 U_1 \sum_{y \in Y_1} p(y|1) = 4 \times 0.25 = 1$, which is not an
improvement either. So $c$ is indeed a Nash code.
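
The sender condition (6) of Proposition 1 can be tested for every codebook in (13) at once. The following sketch, with the illustrative helper names `decode` and `is_nash_code`, compares for each state the probability of correct decoding of the prescribed codeword with that of every other channel input, including inputs that would duplicate the other codeword. The sender's utilities are not needed, since only these probabilities matter.

```python
from fractions import Fraction as F
from itertools import permutations

P = [[F(85, 100), F(10, 100), F(5, 100)],
     [F(10, 100), F(65, 100), F(25, 100)],
     [F(0),       F(30, 100), F(70, 100)]]   # channel (7)
q = [F(1, 2), F(1, 2)]
V = [F(8), F(2)]

def decode(codebook):
    """Best response: each output y goes to the state maximizing q_i V_i p(y|x^i)."""
    return [max(range(2), key=lambda i: q[i] * V[i] * P[codebook[i]][y])
            for y in range(3)]

def is_nash_code(codebook):
    """Check the sender condition (6) for every state and every channel input."""
    d = decode(codebook)
    for i, own in enumerate(codebook):
        # hit[x] = probability that input x is decoded as state i.
        hit = [sum(P[x][y] for y in range(3) if d[y] == i) for x in range(3)]
        if max(hit) > hit[own]:
            return False
    return True

nash_codes = [cb for cb in permutations(range(3), 2) if is_nash_code(cb)]
print(nash_codes)
```

The result agrees with the discussion of (13): the codebooks 0,2, 1,2, and 2,0 pass the test, while 0,1, 1,0, and 2,1 do not.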



The code $c$ with codebook 0,2 is also seen to be a Nash code with the help of table (13)
according to the proof of Theorem 3. Namely, it suffices to look for profitable sender
deviations $c'$ where only one codeword is altered, which would also imply an improvement
to the receiver's payoff from $V(c, d)$ to $V(c', d)$, and hence certainly an improvement to
his payoff $V(c', d')$ where $d'$ is the best response to $c'$. For the two possible codes $c'$ given
by 1,2 and 0,1, the receiver payoff $V$ does not improve according to (13), so $c$ is a Nash
code. By this reasoning, any "locally" receiver-optimal code, according to the following
definition, is also a Nash code.



Definition 4 A locally receiver-optimal code is a code $c$ so that no code $c'$ that differs
from $c$ in only a single codeword gives higher expected payoff to the receiver. That is, for

^In Crawford and Sobel (1982), it is the uninformative equilibrium with a single partition class for the 
sender. 
all $c'$ with $c'(i) \ne c(i)$ for some state $i$, and $c'(j) = c(j)$ for all $j \ne i$,

\[ V(c, d) \ge V(c', d') \]

where $d$ is a best response to $c$ and $d'$ is a best response to $c'$.

Theorem 5 Every locally receiver-optimal code is a Nash code.

Proof. Apply the proof of Theorem 3 from Step two onwards. □

Clearly, every receiver-optimal code is also locally receiver-optimal, so Theorem 3 can be
considered as a corollary to the stronger Theorem 5.

Local receiver-optimality is more easily verified than global receiver-optimality, because
far fewer codes $c'$ have to be considered as possible improvements for the receiver
payoff according to Definition 4. A locally receiver-optimal code can be reached by
iterating profitable changes of single codewords at a time. This simplifies the search for
Nash codes.
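
The iteration described above can be sketched as a simple local search on the example channel (7). This is only one possible way to organize the search: the choice of always taking the single-codeword change with the largest receiver payoff is an assumption made here, as are the names `receiver_value` and `improve_locally`. The receiver's best-response payoff is computed as $\sum_y \max_i q_i V_i\, p(y|x^i)$, which follows from (3) and (4).

```python
from fractions import Fraction as F

P = [[F(85, 100), F(10, 100), F(5, 100)],
     [F(10, 100), F(65, 100), F(25, 100)],
     [F(0),       F(30, 100), F(70, 100)]]   # channel (7)
q = [F(1, 2), F(1, 2)]
V = [F(8), F(2)]

def receiver_value(codebook):
    """Receiver payoff V(c, d) when d is a best response to the codebook."""
    return sum(max(q[i] * V[i] * P[x][y] for i, x in enumerate(codebook))
               for y in range(len(P[0])))

def improve_locally(codebook):
    """Repeat the best single-codeword change while it raises the receiver's value."""
    codebook = tuple(codebook)
    while True:
        # All codebooks that differ from the current one in exactly one codeword.
        neighbours = [codebook[:i] + (x,) + codebook[i + 1:]
                      for i in range(len(codebook))
                      for x in range(len(P)) if x != codebook[i]]
        best = max(neighbours, key=receiver_value)
        if receiver_value(best) <= receiver_value(codebook):
            return codebook            # locally receiver-optimal
        codebook = best

print(improve_locally((0, 1)), improve_locally((2, 1)))
```

The loop terminates because the receiver's value strictly increases at each step and there are finitely many codebooks. Starting from 0,1 it stops at 0,2, which is locally but not globally receiver-optimal, while starting from 2,1 it reaches 2,0.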

To conclude this section, we consider the connection to potential games which also allow
for iterative improvements in order to find a Nash equilibrium. As in Monderer and
Shapley (1996, p. 127), consider a game in strategic form with finite player set $N$, and
pure strategy set $S_i$ and utility function $u^i$ for each player $i$. Then the game has an (ordinal)
potential function $P : \prod_{j \in N} S_j \to \mathbb{R}$ if for all $i \in N$ and $s^{-i} \in \prod_{j \ne i} S_j$ and $s^i, \hat{s}^i \in S_i$,

\[ u^i(s^{-i}, s^i) > u^i(s^{-i}, \hat{s}^i) \ \iff\ P(s^{-i}, s^i) > P(s^{-i}, \hat{s}^i). \tag{14} \]

The question is if in our game, the receiver's payoff is a potential function. The following
proposition gives an answer.

Proposition 6 Consider the game with $M + 1$ players where for each state $i$ in $\Omega$, a
separate agent $i$ transmits a codeword $c(i)$ over the channel, which defines a function
$c : \Omega \to X^n$, and where the receiver decodes each channel output with a decoding function
$d$ as before. Each agent receives the same payoff $U(c, d)$ as the original sender. Then

(a) Any Nash equilibrium $(c, d)$ of the $(M + 1)$-player game is a Nash equilibrium of
the original two-player game, and vice versa.

(b) The receiver's expected payoff is a potential function for the $(M + 1)$-player game.

(c) The receiver's expected payoff is not necessarily a potential function for the original
two-player game.

Proof. Every profile $c$ of $M$ strategies for the agents in the $(M + 1)$-player game can
be seen as a sender strategy in the original game, and vice versa. To see (a), let $(c, d)$

^We thank Rann Smorodinsky for raising this question. 
be a Nash equilibrium of the $(M + 1)$-player game. If there was a profitable deviation $\hat{c}$
from $c$ for the sender in the two-player game as in (9), then there would also be a profitable
deviation $c'$ that changes only one codeword $c(i)$ as in (11), which is a profitable deviation
for agent $i$, a contradiction. The "vice versa" part of (a) holds because any profitable
deviation of a single agent is also a profitable deviation for the sender in the original game.

Assertion (b) holds because for any $i$ in $\Omega$, (11) is, via (10), equivalent to (12).



To see (c), consider our example (7) with $c$ and $\hat{c}$ given by the codebooks 0,1 and 1,2,
respectively, and $d$ decoding channel outputs $y = 0, 1, 2$ as states 0, 0, 1, respectively. Then
the payoffs to sender and receiver are

\[
\begin{aligned}
U(c, d) &= q_0 U_0 \big(p(0|0) + p(1|0)\big) + q_1 U_1\, p(2|1) = 1 \times (0.85 + 0.1) + 4 \times 0.25 = 1.95 \\
V(c, d) &= q_0 V_0 \big(p(0|0) + p(1|0)\big) + q_1 V_1\, p(2|1) = 4 \times (0.85 + 0.1) + 1 \times 0.25 = 4.05 \\
U(\hat{c}, d) &= q_0 U_0 \big(p(0|1) + p(1|1)\big) + q_1 U_1\, p(2|2) = 1 \times (0.1 + 0.65) + 4 \times 0.7 = 3.55 \\
V(\hat{c}, d) &= q_0 V_0 \big(p(0|1) + p(1|1)\big) + q_1 V_1\, p(2|2) = 4 \times (0.1 + 0.65) + 1 \times 0.7 = 3.7
\end{aligned}
\]

which shows that (14) does not hold with $u^i$ as sender payoff and $P$ as receiver payoff,
because these payoffs move in opposite directions when changing the sender's strategy
from $c$ to $\hat{c}$, for this $d$. □
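
The four payoffs in this counterexample are easy to recompute. In the sketch below, the decoding is held fixed at the list `d` (outputs 0, 1, 2 decoded as states 0, 0, 1), and the hypothetical helper `payoff` evaluates either player's expected payoff by passing that player's utilities.

```python
from fractions import Fraction as F

P = [[F(85, 100), F(10, 100), F(5, 100)],
     [F(10, 100), F(65, 100), F(25, 100)],
     [F(0),       F(30, 100), F(70, 100)]]   # channel (7)
q = [F(1, 2), F(1, 2)]
U = [F(2), F(8)]
V = [F(8), F(2)]
d = [0, 0, 1]   # fixed decoding: outputs 0, 1, 2 are read as states 0, 0, 1

def payoff(codebook, utility):
    """Expected payoff of a codebook under the fixed decoding d."""
    return sum(q[i] * utility[i] * sum(P[x][y] for y in range(3) if d[y] == i)
               for i, x in enumerate(codebook))

c, c_hat = (0, 1), (1, 2)
sender_change = payoff(c_hat, U) - payoff(c, U)
receiver_change = payoff(c_hat, V) - payoff(c, V)
print(float(sender_change), float(receiver_change))
```

The sender's payoff rises while the receiver's falls, so the receiver's payoff cannot serve as an ordinal potential for the two-player game.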

A global maximum of the potential function gives a Nash equilibrium of the potential
game (Monderer and Shapley, 1996, Lemma 2.1). Hence, (a) and (b) of Proposition 6
imply that a maximum of the receiver payoff defines a Nash equilibrium, as stated in
Theorem 3. It is also known that a "local" maximum of the potential function defines
a Nash equilibrium (Monderer and Shapley, 1996, footnote 4). However, this does not
imply our Theorem 5. The reason is that in a local maximum of the potential function,
the function cannot be improved by unilaterally changing a single player's strategy. In
contrast, in a locally receiver-optimal code, the receiver's payoff cannot be improved by
changing a single codeword together with the receiver's best response. For example, the
Nash code 1,2 in (13) with best-response partition $Y_0 = \{0, 1, 2\}$ is not locally
receiver-optimal, but is a "local maximum" of the receiver payoff.

In a potential game, improvements of the potential function can be used for dynamics that 
lead to Nash equilibria. For our games, the study of such dynamics may be an interesting 
topic for future research. 



4 Binary channels and monotonic decoding 

The main result of this section concerns the important binary channel with $X = Y =
\{0, 1\}$. The two possible symbols 0 and 1 for a single use of the channel are called bits.
The binary channel is the basic model for the transmission of digital data and of central
theoretical and practical importance in information theory (see, for example, Cover and
Thomas, 1991, or MacKay, 2003).

We assume that the channel errors $\varepsilon_0 = p(1|0)$ and $\varepsilon_1 = p(0|1)$ fulfill

\[ \varepsilon_0 > 0, \quad \varepsilon_1 > 0, \quad \varepsilon_0 + \varepsilon_1 < 1, \tag{15} \]

where $\varepsilon_0 + \varepsilon_1 < 1$ is equivalent to either of the inequalities

\[ 1 - \varepsilon_0 > \varepsilon_1, \qquad 1 - \varepsilon_1 > \varepsilon_0. \tag{16} \]

These assert that a received bit 0 is more likely to have been sent as 0 (with probability
$1 - \varepsilon_0$) than sent as bit 1 and received with error (with probability $\varepsilon_1$), and similarly that
a received bit 1 is more likely to have been sent as 1 than received erroneously. It may
still happen that bit 0, for example, is transmitted with higher probability incorrectly than
correctly, for example if $\varepsilon_0 = 3/4$ and $\varepsilon_1 = 1/8$.



Condition (15) can be assumed with very little loss of generality. If $\varepsilon_0 = \varepsilon_1 = 0$ then the
channel is error-free and every message can be decoded perfectly. If $\varepsilon_0 + \varepsilon_1 = 1$ then the
channel output is independent of the input and no information can be transmitted. For
$\varepsilon_0 + \varepsilon_1 > 1$ the signal is more likely to be inverted than not, so that one obtains (15) by
exchanging 0 and 1 in $Y$.
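
The case distinction above can be written as a small normalization routine. This is a sketch only: the function name `normalize_binary_channel` and its return convention are assumptions made for illustration. Exchanging the output symbols turns $p(1|0)$ into $p(0|0) = 1 - \varepsilon_0$ and $p(0|1)$ into $p(1|1) = 1 - \varepsilon_1$, which is what the swap branch computes.

```python
from fractions import Fraction as F

def normalize_binary_channel(e0, e1):
    """Return (e0, e1, swapped) with error probabilities satisfying (15).

    swapped is True if the output symbols 0 and 1 were exchanged.
    Raises ValueError for the cases that (15) excludes outright.
    """
    if not (0 <= e0 <= 1 and 0 <= e1 <= 1):
        raise ValueError("error probabilities must lie in [0, 1]")
    if e0 + e1 == 1:
        raise ValueError("channel output is independent of the input")
    swapped = e0 + e1 > 1
    if swapped:
        # Relabeling outputs turns p(1|0) into p(0|0) and p(0|1) into p(1|1).
        e0, e1 = 1 - e0, 1 - e1
    if e0 == 0 or e1 == 0:
        raise ValueError("error-free and one-sided channels are excluded by (15)")
    return e0, e1, swapped

print(normalize_binary_channel(F(3, 4), F(1, 8)))
print(normalize_binary_channel(F(9, 10), F(4, 5)))
```

The first call leaves the channel unchanged, since $3/4 + 1/8 < 1$ even though bit 0 is more often flipped than not; the second relabels the outputs.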



Condition (15) does exclude the interesting case of a "Z-channel" that has only one-sided
errors, that is, $\varepsilon_0 = 0$ or $\varepsilon_1 = 0$. We assume instead that this is modelled by vanishingly
small error probabilities, in order to avoid case distinctions about channel outputs $y$ in
$Y^n$ that cannot occur for some inputs $x$ when $\varepsilon_0 = 0$ or $\varepsilon_1 = 0$. With (15), every channel
output $y$ has positive, although possibly very small, probability.



The binary channel is symmetric when $\varepsilon_0 = \varepsilon_1 = \varepsilon > 0$, where $\varepsilon < 1/2$ by (15).



The binary channel is used $n$ times independently. A code $c : \Omega \to X^n$ for $X = \{0, 1\}$ is
also called a binary code. The main result of this section (Theorem 8 below) states that
any binary code is a Nash code provided the decoding is monotone. This monotonicity
condition concerns how the receiver resolves ties when a received channel output $y$ can
be decoded in more than one way.

We first consider an example of a binary code that shows that the equilibrium property
may depend on how the receiver deals with ties. Assume that the channel is symmetric
with error probability $\varepsilon$. Let $M = 4$, $n = 3$, and consider the codebook $x^0, x^1, x^2, x^3$ given
by 000, 100, 010, 001. All four states $i$ have equal prior probabilities $q_i = 1/4$ and equal
sender and receiver utilities $U_i = V_i = 1$. The sets $Y_i$ in (3) are given by

\[
\begin{aligned}
Y_0 &= \{000\}, & Y_2 &= \{010, 011, 110, 111\}, \\
Y_1 &= \{100, 101, 110, 111\}, & Y_3 &= \{001, 011, 101, 111\}.
\end{aligned}
\tag{17}
\]

This shows that for any channel output $y$ other than an original codeword $x^i$, there are
ties between at least two states. For example, $110 \in Y_1 \cap Y_2$ because 110 is received with
probability $\varepsilon(1 - \varepsilon)^2$ for $x^1$ and $x^2$ as channel input. For $y = 111$, all three states 1, 2, 3
are tied.


^Hernandez, Urbano, and Vila (2010b) show that for a binary noisy channel, the decoding rule of "joint 
typicality" used in a standard proof of Shannon's channel coding theorem (Cover and Thomas, 1991, Sec- 
tion 8.7) may not define a Nash equilibrium. 
Figure 1: Binary code with four codewords 000, 100, 010, 001, with non-monotonic decoding
(left) and monotonic decoding (right). The light-grey sets indicate how a channel
output is decoded.



Consider first the case that the receiver decodes the channel outputs 110, 011, 101 as states
1, 2, 3, respectively, that is, according to

$$
d(110, 1) = 1, \qquad d(011, 2) = 1, \qquad d(101, 3) = 1. \tag{18}
$$

We claim that this cannot be a Nash code, irrespective of the decoding probabilities
$d(111, i)$ which can be positive for any $i = 1, 2, 3$ by (17). The situation is symmetric
for $i = 1, 2, 3$, so assume that $d(111, i)$ is positive when $i = 1$; the case of a deterministic
decoding where $d(111, 1) = 1$ is shown on the left in Figure 1. Then the receiver
decodes $y$ as state 1 with positive probability when $y$ equals 100, 110, or 111. When
$x^1 = 100$ is sent, these channel outputs are received with probabilities $(1-\varepsilon)^3$, $\varepsilon(1-\varepsilon)^2$,
and $\varepsilon^2(1-\varepsilon)$, respectively, so the sender payoff is

$$
(1-\varepsilon)^3 + \varepsilon(1-\varepsilon)^2 + \varepsilon^2(1-\varepsilon)\, d(111, 1)
$$

in (5). Given this decoding, the sender can improve her payoff in state 1 by sending
$x = 110$ rather than $x^1 = 100$ because then the probabilities of the channel outputs 100
and 110 are just exchanged, whereas the probability that output 111 is decoded as state 1
increases to $\varepsilon(1-\varepsilon)^2\, d(111, 1)$; that is, given this decoding, sending $x = 110$ is more
likely to be decoded correctly as state 1 than sending $x^1 = 100$. This violates (6).
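The profitable deviation can be checked numerically. The sketch below assumes the deterministic variant of (18) with $d(111, 1) = 1$ and an illustrative error probability of 0.1; `sender_payoff` is a hypothetical helper that sums the probabilities of all outputs decoded as state 1.

```python
def p(y, x, eps):
    """Probability of output y given input x on a symmetric binary channel."""
    flips = sum(a != b for a, b in zip(x, y))
    return eps ** flips * (1 - eps) ** (len(x) - flips)

def sender_payoff(x, decode_prob, eps):
    """Probability that the output is decoded as the intended state when x is sent."""
    return sum(p(y, x, eps) * prob for y, prob in decode_prob.items())

# Outputs decoded as state 1 under (18) with d(111,1) = 1 (assumed for illustration).
decoded_as_1 = {"100": 1.0, "110": 1.0, "111": 1.0}
eps = 0.1

honest = sender_payoff("100", decoded_as_1, eps)   # prescribed codeword x^1
deviate = sender_payoff("110", decoded_as_1, eps)  # alternative input
print(honest, deviate)
```

With these numbers the deviation to 110 yields a strictly higher probability of correct decoding than the prescribed codeword 100, in line with the argument above.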



The problem with the decoding in (18) is that when the receiver is tied between states 1,
2, and 3 when the channel output is $y' = 111$, he decodes $y'$ as state 1 with positive
probability $d(111, 1)$, but when he is tied between even fewer states 1 and 3 when receiving
$y = 101$, that decoding probability $d(101, 1)$ decreases to zero. This violates the following
monotonicity condition.



Definition 7 Consider a codebook with codewords $x^i$ for $i \in \Omega$ and a decoding function
$d$. Then $d$ is called monotonic if it is a best response decoding function with (3)
and (4), and if for all $y, y' \in Y^n$ and states $i$,

$$
T = \{k \in \Omega \mid y \in Y_k\}, \quad T' = \{k \in \Omega \mid y' \in Y_k\}, \quad i \in T \subseteq T'
\;\Longrightarrow\; d(y, i) \ge d(y', i). \tag{19}
$$
In (19), $T$ is the set of tied states for channel output $y$, and $T'$ is the set of tied states for
channel output $y'$, and both sets include state $i$. The condition states that the probability
of decoding the channel output as state $i$ can only decrease when the set of tied states
increases.



We study the monotonicity condition in Definition 7 in more detail in the next section.
We conclude with the main result of this section; its proof and some technical comments 
are given in the Appendix. 

Theorem 8 Every monotonically decoded binary code is a Nash code. 
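As a sanity check of Theorem 8, the following sketch tests randomly drawn binary codes for profitable sender deviations. It makes several assumptions that are not part of the text: unit utilities, ties broken in favor of the lowest state index (one particular monotonic rule), exact rational arithmetic, and condition (15) read as $\varepsilon_0, \varepsilon_1 > 0$ with $\varepsilon_0 + \varepsilon_1 < 1$. All function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import product

def p(y, x, e0, e1):
    """p(y|x) when input bit 0 flips with probability e0 and input bit 1 with e1."""
    prob = Fraction(1)
    for xb, yb in zip(x, y):
        err = e0 if xb == "0" else e1
        prob *= err if xb != yb else 1 - err
    return prob

def decode(y, codebook, priors, e0, e1):
    """Best response with ties broken by lowest state index (a fixed order)."""
    scores = [q * p(y, x, e0, e1) for q, x in zip(priors, codebook)]
    return scores.index(max(scores))

def is_nash(codebook, priors, e0, e1):
    """True if in no state a deviation x is decoded correctly more often than x^i."""
    words = ["".join(b) for b in product("01", repeat=len(codebook[0]))]
    decoded = {y: decode(y, codebook, priors, e0, e1) for y in words}
    for i, xi in enumerate(codebook):
        def payoff(x):
            return sum(p(y, x, e0, e1) for y in words if decoded[y] == i)
        honest = payoff(xi)
        if any(payoff(x) > honest for x in words):
            return False
    return True

def random_trial(rng, n=4, m=4):
    """Draw a random code (repeated codewords allowed) and random rational priors."""
    words = ["".join(b) for b in product("01", repeat=n)]
    codebook = [rng.choice(words) for _ in range(m)]
    weights = [rng.randint(1, 3) for _ in range(m)]
    priors = [Fraction(w, sum(weights)) for w in weights]
    return codebook, priors

rng = random.Random(0)
e0, e1 = Fraction(3, 5), Fraction(1, 5)  # e0 > 1/2 is allowed as long as e0 + e1 < 1
results = [is_nash(*random_trial(rng), e0, e1) for _ in range(50)]
print(all(results))
```

A search of this kind cannot replace the proof in the Appendix; it only illustrates that no counterexample appears among the sampled codes.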



5 Monotonic decoding functions 

When is a decoding function monotonic? Suppose there is some fixed order on the set of 
states so that always the first tied state is chosen according to that order. In this section, 
we show that this is essentially the only way to break ties with a deterministic monotonic 
decoding function. 

The monotonicity condition in Definition 7 implies

$$
T = \{k \in \Omega \mid y \in Y_k\}, \quad T' = \{k \in \Omega \mid y' \in Y_k\}, \quad i \in T = T'
\;\Longrightarrow\; d(y, i) = d(y', i). \tag{20}
$$

That is, the decoding probability $d(y, i)$ of state $i$ may only depend on the set $T$ of states
that are tied with $i$, but not on the received channel output $y$. For that reason, we can define
a monotonic decoding function also as a function $d(T, i)$ of the set $T$ of best-response
states,

$$
d(T, i) := d(y, i) \quad \text{if } T = \{k \in \Omega \mid y \in Y_k\}, \tag{21}
$$

which is well defined by (20).



A natural example of a probabilistic monotonic decoding function is to break ties uniformly
with $d(T, i) = 1/|T|$ for $i \in T$. A more general monotonic decoding function is
$d(T, i) = w_i / \sum_{k \in T} w_k$ for $i \in T$ with a fixed positive weight $w_k$ for each state $k$. There
are many other probabilistic monotonic decoding functions. For example, if ties between
three or more states are broken uniformly, then ties between only two states are decoded
monotonically if the decoding probabilities for both tied states are at least 1/3.
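These examples can be checked by brute force over all nonempty sets of tied states. The sketch below assumes four states and treats a decoding function as a function of the tie set $T$ and the state $i$, requiring $d(T, i) \ge d(T', i)$ whenever $i \in T \subseteq T'$; the weights and the pairwise probabilities 2/3 and 1/3 are illustrative choices, and the function names are hypothetical.

```python
from itertools import combinations

def nonempty_subsets(states):
    """All nonempty subsets of the state set, as frozensets."""
    return [frozenset(T) for r in range(1, len(states) + 1)
            for T in combinations(states, r)]

def is_monotonic(d, states):
    """Check that i in T and T a subset of T2 imply d(T, i) >= d(T2, i)."""
    subsets = nonempty_subsets(states)
    return all(
        d(T, i) >= d(T2, i) - 1e-12
        for T in subsets for T2 in subsets if T <= T2
        for i in T
    )

def weighted(weights):
    """Tie-breaking proportional to fixed positive weights."""
    return lambda T, i: weights[i] / sum(weights[k] for k in T)

def mixed(T, i):
    """Uniform for three or more tied states; 2/3 and 1/3 for two tied states."""
    if len(T) == 2:
        return 2 / 3 if i == min(T) else 1 / 3
    return 1 / len(T)

def too_skewed(T, i):
    """Like mixed, but one of two tied states gets less than 1/3."""
    if len(T) == 2:
        return 0.9 if i == min(T) else 0.1
    return 1 / len(T)

states = [0, 1, 2, 3]
uniform = weighted({k: 1.0 for k in states})
skewed = weighted({0: 5.0, 1: 1.0, 2: 2.0, 3: 0.5})
print(is_monotonic(uniform, states), is_monotonic(skewed, states),
      is_monotonic(mixed, states), is_monotonic(too_skewed, states))
```

The last rule fails because a state that receives probability 0.1 in a two-way tie would receive 1/3 in a three-way tie, so its decoding probability increases when the tie set grows.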

We will show that deterministic monotonic decoding functions are more restrictive. Consider
again the example (18) with $d(111, 1) = 1$ as shown on the left in Figure 1. (Note
that this decoding is not monotonic but fulfills the weaker condition (20) which therefore
does not suffice to guarantee a Nash code.)



The following decoding function, changed from (18) so that 101 is decoded as state 1, is
monotonic,

$$
d(110, 1) = 1, \qquad d(011, 2) = 1, \qquad d(101, 1) = 1, \qquad d(111, 1) = 1, \tag{22}
$$

shown in the right picture in Figure 1. This is a Nash code because all $y$ in the set $Y_1$, see
(17), are decoded as state 1; whichever $x$ in $Y_1$ the sender decides to transmit instead of
$x^1$, there is one $y$ in $Y_1$ for which $p(y|x) = \varepsilon^2(1-\varepsilon)$, so that the payoff to the sender in (5)
does not increase by changing from $x^1$ to $x$.
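The difference between the two decodings can be made concrete by testing conditions (19) and (20) on the tie sets from (17). This is a minimal sketch for deterministic decodings only, written as maps from outputs to states; the dictionary names `d18` and `d22` and the helper functions are hypothetical.

```python
# Best-response sets Y_i from (17).
Y = {
    0: {"000"},
    1: {"100", "101", "110", "111"},
    2: {"010", "011", "110", "111"},
    3: {"001", "011", "101", "111"},
}

def tie_set(y):
    """Set of states k with y in Y_k."""
    return frozenset(k for k, outputs in Y.items() if y in outputs)

def satisfies_19(dec):
    """Monotonicity (19) for a deterministic decoding dec: output -> state."""
    for y in dec:
        for y2 in dec:
            T, T2 = tie_set(y), tie_set(y2)
            # Violation: y2 is decoded as some i in T, but y is not decoded as i.
            if T <= T2 and dec[y2] in T and dec[y] != dec[y2]:
                return False
    return True

def satisfies_20(dec):
    """Weaker condition (20): equal tie sets must be decoded identically."""
    return all(dec[y] == dec[y2] for y in dec for y2 in dec
               if tie_set(y) == tie_set(y2))

# Decoding (18) with d(111,1) = 1, and decoding (22).
d18 = {"000": 0, "100": 1, "010": 2, "001": 3,
       "110": 1, "011": 2, "101": 3, "111": 1}
d22 = dict(d18, **{"101": 1})

print(satisfies_20(d18), satisfies_19(d18), satisfies_19(d22))
```

The decoding from (18) passes (20) because every tie set occurs for only one output, but it fails (19) on the pair 101 and 111, whereas (22) passes both.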



As the right picture in Figure 1 shows, the decoding function in (22) can be defined by the
following condition: Consider a fixed linear order $\prec$ on $\Omega$ (in this case $0 \prec 1 \prec 2 \prec 3$) so
that

$$
d(T, i) = 1 \;\Longleftrightarrow\; i \in T \text{ and } \forall k \in T,\ k \ne i : i \prec k. \tag{23}
$$

A fixed-order decoding function $d$ fulfills (23) for some $\prec$. Such a decoding function is
deterministic and clearly monotonic.
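A fixed-order decoding function is short to write down. The sketch below implements (23) for an order given as a list of states and applies it to the tie sets of the example; `fixed_order_decoder` is a hypothetical name.

```python
def fixed_order_decoder(order):
    """Return d(T, i) as in (23) for the linear order given by the list `order`."""
    rank = {state: r for r, state in enumerate(order)}
    def d(T, i):
        return 1 if i in T and all(rank[i] < rank[k] for k in T if k != i) else 0
    return d

d = fixed_order_decoder([0, 1, 2, 3])

# Tie sets of the four non-codeword outputs, read off from (17).
ties = {"110": {1, 2}, "011": {2, 3}, "101": {1, 3}, "111": {1, 2, 3}}
decoded = {y: next(i for i in T if d(T, i) == 1) for y, T in ties.items()}
print(decoded)
```

With the order 0, 1, 2, 3 this reproduces the decoding (22): the outputs 110, 101, and 111 go to state 1, and 011 goes to state 2.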

We want to show that any deterministic monotonic decoding function is a fixed-order
decoding function. We have to make the additional assumption that the decoding function
$d(T, i)$ is general in the sense that it is defined for any nonempty set $T$ of states, not only
the sets $T$ that occur as sets of tied states for some channel output $y$ as in (21).
Without this assumption, we could add to the above example another state with codeword
111 so that the "circular" decoding function in (18) is monotonic and gives a Nash
code, but is clearly not a fixed-order decoding function. It is reasonable to require that
a decoding function is defined generally and does not just coincidentally lead to a Nash
code because certain ties do not occur (as argued above, with the decoding (18) we do not
have a Nash code when ties have to be resolved for $y = 111$).



For general decoding functions, (19) translates to the requirement that for any $T, T' \subseteq \Omega$,

$$
i \in T \subseteq T' \;\Longrightarrow\; d(T, i) \ge d(T', i). \tag{24}
$$

Proposition 9 Every general deterministic monotonic decoding function is a fixed-order 
decoding function. 

Proof Because the decoding function is deterministic, $d(T, i) \in \{0, 1\}$ for any nonempty
set $T$ of states. Define the following binary relation $\prec$ on $\Omega$ for distinct states $i, j$:

$$
i \prec j \;\Longleftrightarrow\; d(\{i, j\}, i) = 1.
$$

Clearly, either $i \prec j$ or $j \prec i$ for any two distinct states $i, j$. We claim that $\prec$ is transitive, that
is, if $i \prec j$ and $j \prec k$, then $i \prec k$. Otherwise, there would be a "cycle" of distinct $i, j, k$
with $i \prec j$ and $j \prec k$ and $k \prec i$. This is symmetric in $i, j, k$, so assume $d(\{i, j, k\}, i) = 1$
and therefore $d(\{i, j, k\}, j) = 0$ and $d(\{i, j, k\}, k) = 0$. However, with $T = \{i, k\}$ and
$T' = \{i, j, k\}$ we have $d(T, i) = 0 < 1 = d(T', i)$, which contradicts (24).

So $\prec$ defines a linear order on $\Omega$. We show that (23) holds, that is, for any nonempty set
of states $T'$ the state $i$ so that $d(T', i) = 1$ is the state $i$ that fulfills $i \prec k$ for all $k \in T'$ with $k \ne i$.
This holds trivially and by definition if $T'$ has at most two elements. Otherwise, if $k \prec i$
for some $k \in T'$, then we obtain with $T = \{i, k\}$ the same contradiction $d(T, i) = 0 < 1 =
d(T', i)$ as before. So the decoded state is chosen according to the fixed order $\prec$ on $\Omega$ as
claimed. □
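For small state sets, Proposition 9 can be confirmed by exhaustive enumeration. The sketch below lists every general deterministic decoding function on three or four states (one chosen state per nonempty set $T$), keeps those satisfying (24), and compares them with the decodings induced by all linear orders. The function names are hypothetical, and the check is an illustration rather than a substitute for the proof.

```python
from itertools import combinations, permutations, product

def nonempty_subsets(states):
    """All nonempty subsets of the state set, as frozensets."""
    return [frozenset(T) for r in range(1, len(states) + 1)
            for T in combinations(states, r)]

def all_deterministic_decoders(states):
    """Yield every map T -> decoded state in T, over all nonempty T."""
    subsets = nonempty_subsets(states)
    for choice in product(*[sorted(T) for T in subsets]):
        yield dict(zip(subsets, choice))

def is_monotonic(dec):
    """(24) for deterministic d: if T2's choice lies in a subset T, T must choose it too."""
    return all(dec[T] == dec[T2]
               for T in dec for T2 in dec
               if T <= T2 and dec[T2] in T)

def fixed_order(order, subsets):
    """Decoding that always picks the first tied state in the given order."""
    rank = {s: r for r, s in enumerate(order)}
    return {T: min(T, key=rank.get) for T in subsets}

def check(n):
    states = list(range(n))
    subsets = nonempty_subsets(states)
    monotonic = [dec for dec in all_deterministic_decoders(states) if is_monotonic(dec)]
    fixed = [fixed_order(order, subsets) for order in permutations(states)]
    return (len(monotonic),
            all(dec in fixed for dec in monotonic),
            all(is_monotonic(f) for f in fixed))

print(check(3), check(4))
```

For $n$ states the number of monotonic deterministic decoders found equals $n!$, the number of linear orders, and each of them coincides with a fixed-order decoder.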
When the prior probabilities $q_i$ or the receiver utilities $V_i$ for the states $i$ are generic,
then for every channel output the set of best-response states determined by (3) is a singleton,
so no ties occur and decoding is deterministic. One
can make any prior probabilities generic by perturbing them minimally so that ties are
broken uniquely but decoding is otherwise unaffected. That is, if $i$ and $j$ are tied for
some $y$ because $q_i V_i\, p(y|x^i) = q_j V_j\, p(y|x^j)$, this tie is broken in favor of $i$ by slightly
increasing $q_i$, which will then always happen whenever $i$ and $j$ are tied originally. This
induces a fixed-order decoding, where any linear order among the states can be chosen.
Thus, Proposition 9 asserts that general deterministic monotonic decoding functions are
those obtained by generic perturbation of the priors.



Finally, we observe that the above codebook 000, 100, 010, 001 with decoding as in (22)
defines a Nash code (and if priors are minimally perturbed so that $q_1 > q_2 > q_3$ there
are no ties and decoding is unique), but this code is not locally optimal as in Theorem 5.
Namely, by changing the codeword 100 to 110, all possible channel outputs $y$ differ in
at most one bit from one of the four codewords, which clearly improves the payoff to the
receiver. So not all binary Nash codes are locally receiver-optimal.
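The improvement can be quantified with a short computation. The sketch below assumes equal priors, unit utilities, and an illustrative symmetric error probability of 0.1; the receiver payoff is taken as the sum over outputs of the largest value $q_i\, p(y|x^i)$, and the helper names are hypothetical.

```python
from itertools import product

def hamming(a, b):
    """Number of positions in which two bit strings differ."""
    return sum(u != v for u, v in zip(a, b))

def p(y, x, eps):
    """Probability of output y given input x on a symmetric binary channel."""
    k = hamming(x, y)
    return eps ** k * (1 - eps) ** (len(x) - k)

def outputs(n):
    return ["".join(b) for b in product("01", repeat=n)]

def receiver_payoff(codebook, eps):
    """Expected receiver payoff under best-response decoding, equal priors, unit utilities."""
    q = 1 / len(codebook)
    return sum(max(q * p(y, x, eps) for x in codebook) for y in outputs(len(codebook[0])))

def max_distance_to_code(codebook):
    """Largest Hamming distance from any output to its nearest codeword."""
    return max(min(hamming(y, x) for x in codebook) for y in outputs(len(codebook[0])))

old = ["000", "100", "010", "001"]
new = ["000", "110", "010", "001"]
print(max_distance_to_code(old), receiver_payoff(old, 0.1))
print(max_distance_to_code(new), receiver_payoff(new, 0.1))
```

Under these assumptions the output 111 is two bits away from every codeword of the original code, while the modified code covers all outputs within one bit and gives the receiver a higher expected payoff.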



Appendix: Proof of Theorem 8

We first give an outline of the proof of Theorem 8. We want to show that for each state $i$
in $\Omega$, the sender maximizes the probability of correct decoding by sending the prescribed
codeword $x^i$, so that (6) holds for any $x \in X^n$. For any channel output $y$, comparing
$p(y|x^i)$ and $p(y|x)$ is only affected by the bits where $x^i$ and $x$ differ, defined by the set $D$
in (25) below. For these bits, the corresponding channel outputs are ordered according
to how far they agree with $x^i$ (and hence differ from $x$), indicated by the subset $A$ of $D$
in (30). The key property is that with increasing $A$, such a channel output is more likely
to be decoded as state $i$, which is stated in (37) and is the main technical challenge, proved
with the help of the monotonicity assumption (19). The payoff in (5) is a multilinear
function of the probabilities for receiving the individual output bits, see (32) and (38). By
considering this multilinear expression for each of the transmitted bits in $D$ and using the
error inequalities (16), the monotonicity condition (37) translates to the inequality (6), as
shown in (43).



Proof of Theorem 8. Conditions (3) and (4) state that the receiver uses a best response,
so the equilibrium property holds on the receiver's side.

Let $i \in \Omega$ be the state chosen by nature. Let $x$ in $X^n$ be an arbitrary alternative message to
the codeword $x^i$. We want to prove (6). Let $S$ and $D$ be the sets of bits in $x$ and $x^i$ that are
the same and different, respectively, that is,

$$
S = \{ j \mid x_j = x^i_j,\ 1 \le j \le n \}, \qquad D = \{ j \mid x_j \ne x^i_j,\ 1 \le j \le n \}. \tag{25}
$$

For any sets $Z$ and $A$ and elements $z_j$ in $Z$ for $j \in A$ we write $z_A = (z_j)_{j \in A}$ and denote the
set of these vectors $z_A$ by $Z^A$.

For any $z \in \{0, 1\}^n$ we write $z = (z_S, z_D)$, so that with (1)

$$
p(y|z) = p(y_S|z_S) \cdot p(y_D|z_D). \tag{26}
$$
In particular, by (25),

$$
\begin{aligned}
p(y|x^i) &= p(y_S|x^i_S) \cdot p(y_D|x^i_D) = p(y_S|x_S) \cdot p(y_D|x^i_D),\\
p(y|x) &= p(y_S|x_S) \cdot p(y_D|x_D).
\end{aligned}
\tag{27}
$$

Fix $y_S \in Y^S$. We will show that

$$
\sum_{y_D \in Y^D} p((y_S, y_D)|x^i)\, d((y_S, y_D), i) \;\ge\; \sum_{y_D \in Y^D} p((y_S, y_D)|x)\, d((y_S, y_D), i). \tag{28}
$$

Because $y = (y_S, y_D)$ for $y \in Y^n$, summation of (28) over all $y_S \in Y^S$ then implies (6).
By (3), $y = (y_S, y_D) \in Y_i$ if and only if for all $k \in \Omega$,

$$
q_i V_i\, p(y_S|x^i_S) \cdot p(y_D|x^i_D) \;\ge\; q_k V_k\, p(y_S|x^k_S) \cdot p(y_D|x^k_D). \tag{29}
$$

If equality holds in (29), then $y \in Y_k$ and there is a tie between states $i$ and $k$, which affects
$d(y, i)$ where we will use (19).

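It is useful to consider the channel outputs $y_D$ (for the bits in $D$) according to how they
agree with $x^i$. For $A \subseteq D$, let

$$
y^A_D = (y_j)_{j \in D}, \qquad y_j = x^i_j \ \text{for } j \in A, \qquad y_j = x_j \ \text{for } j \in D - A. \tag{30}
$$

Clearly, any $y_D$ in $Y^D$ can be written as $y_D = y^A_D$ for a unique subset $A$ of $D$.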
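Let $A \subseteq D$ and $y_D = y^A_D \in Y^D$. For $l \in \Omega$, consider the sets

$$
\begin{aligned}
D^l_0 &= \{ j \in D \mid x^l_j = 0 \}, & A^l_0 &= \{ j \in D \mid y_j = x^l_j = 0 \},\\
D^l_1 &= \{ j \in D \mid x^l_j = 1 \}, & A^l_1 &= \{ j \in D \mid y_j = x^l_j = 1 \},
\end{aligned}
\tag{31}
$$

so that $A = A^i_0 \cup A^i_1$. Then according to (1),

$$
p(y_D|x^l_D) = (1 - \varepsilon_0)^{|A^l_0|}\, \varepsilon_0^{|D^l_0 - A^l_0|}\, (1 - \varepsilon_1)^{|A^l_1|}\, \varepsilon_1^{|D^l_1 - A^l_1|}. \tag{32}
$$

For $k \in \Omega$, let

$$
Q_k(A) = p(y^A_D|x^i_D) - R_k \cdot p(y^A_D|x^k_D), \qquad R_k = \frac{q_k V_k\, p(y_S|x^k_S)}{q_i V_i\, p(y_S|x^i_S)}. \tag{33}
$$

Then (29) is equivalent to $Q_k(A) \ge 0$ for all $k \in \Omega$.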

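Let $\operatorname{sign}[t]$ for $t \in \mathbb{R}$ be the usual sign function defined by

$$
\operatorname{sign}[t] =
\begin{cases}
-1 & \text{if } t < 0,\\
0 & \text{if } t = 0,\\
1 & \text{if } t > 0.
\end{cases}
$$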
Let $j \in D - A$, where we write $A \cup j$ instead of $A \cup \{j\}$. We will show

$$
\operatorname{sign}[Q_k(A \cup j)] \ge \operatorname{sign}[Q_k(A)]. \tag{34}
$$

Because $j \notin A = A^i_0 \cup A^i_1$, either $j \in D^i_0 - A^i_0$ or $j \in D^i_1 - A^i_1$.



Consider the case $j \in D^i_0 - A^i_0$, that is, $x^i_j = 0$ and $y_j = 1$ by (31). The change from $A$ to
$A \cup j$ means that $y^{A \cup j}_D$ is obtained from $y^A_D$ by changing $y_j$ from 1 to 0, so that the input
bit $x^i_j$ is now correctly transmitted (which happens with probability $1 - \varepsilon_0$) rather than
incorrectly (probability $\varepsilon_0$). By (32), this means

$$
p(y^{A \cup j}_D|x^i_D) = \frac{1 - \varepsilon_0}{\varepsilon_0}\, p(y^A_D|x^i_D). \tag{35}
$$

Note that it is possible that $\varepsilon_0 > 1 - \varepsilon_0$, which means that the "more correct" channel
output $y^{A \cup j}_D$ (relative to the input bits in $x^i_D$) is less likely than $y^A_D$; this is why we consider
signs in (34) because $Q_k(A \cup j) \ge Q_k(A)$ is not generally true.



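When comparing the output bit $y_j = 1$ with the input bit $x^k_j$ from the codeword $x^k$, either
$j \in D^k_0 - A^k_0$, in which case (35) holds with $k$ instead of $i$, and, by (33), $Q_k(A \cup j) =
(1 - \varepsilon_0)/\varepsilon_0 \cdot Q_k(A)$, so that (34) holds with equality; or, alternatively, $j \in A^k_1$, that is,
$x^k_j = 1$. Changing $y_j$ from 1 to 0 to obtain $y^{A \cup j}_D$ implies that the input $x^k_j$ is now transmitted
with error, so that

$$
p(y^{A \cup j}_D|x^k_D) = \frac{\varepsilon_1}{1 - \varepsilon_1}\, p(y^A_D|x^k_D).
$$

Using (33) and (35), this means

$$
Q_k(A \cup j) = \frac{1 - \varepsilon_0}{\varepsilon_0} \left( p(y^A_D|x^i_D) - R_k \cdot \frac{\varepsilon_0 \varepsilon_1}{(1 - \varepsilon_0)(1 - \varepsilon_1)}\, p(y^A_D|x^k_D) \right) \;\ge\; \frac{1 - \varepsilon_0}{\varepsilon_0}\, Q_k(A)
$$

by (16). Again, (34) holds, where here the sign of $Q_k(A \cup j)$ relative to that of $Q_k(A)$ may
strictly increase.

The case $j \in D^i_1 - A^i_1$ where $x^i_j = 1$ and $y_j = 0$ is entirely analogous, by exchanging 0
and 1 (and thus $\varepsilon_0$ and $\varepsilon_1$) in the preceding reasoning. This shows (34).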



For $A \subseteq D$, let

$$
h_A = d((y_S, y^A_D), i). \tag{36}
$$

We show that for $j \in D - A$,

$$
h_{A \cup j} \ge h_A. \tag{37}
$$

With $y = (y_S, y^{A \cup j}_D)$ and $y' = (y_S, y^A_D)$, let $T$ and $T'$ be defined as in (19). We are going to
show that $T \subseteq T'$. As observed after (33), $y' \in Y_i$ if and only if $Q_k(A) \ge 0$ for all $k \in \Omega$,
and $y \in Y_i$ if and only if $Q_k(A \cup j) \ge 0$ for all $k \in \Omega$. If $Q_k(A) < 0$ for some $k \in \Omega$, then
$h_A = 0$ by (4), and (37) holds trivially. So we can assume that $Q_k(A) \ge 0$ for all $k \in \Omega$ and
therefore $Q_k(A \cup j) \ge 0$ for all $k \in \Omega$ by (34), that is, $i \in T$, and the sets of all states $k$ that
are tied with $i$ are given by

$$
T' = \{ k \in \Omega \mid Q_k(A) = 0 \}, \qquad T = \{ k \in \Omega \mid Q_k(A \cup j) = 0 \},
$$

which implies that $T \subseteq T'$ by (34). Using the assumption (19) then implies (37) as
claimed.



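Consider now the function $f : [0, 1]^D \to \mathbb{R}$ which for $z_D \in [0, 1]^D$ is defined by

$$
f(z_D) = \sum_{A \subseteq D} h_A \prod_{l \in A} z_l \prod_{l \in D - A} (1 - z_l), \tag{38}
$$

which is the unique multilinear interpolation of the values $h_A$ defined on the vertices
$(1_A, 0_{D-A})$ of the unit cube $[0, 1]^D$, where $(1_A, 0_{D-A})_j$ is 1 for $j \in A$ and 0 otherwise, with
$f(1_A, 0_{D-A}) = h_A$ by (38).

The monotonicity (37) extends to the monotonicity of $f(z_j, z_{D-j})$ in each variable $z_j$,
where we write $D - j$ for $D - \{j\}$ and $z_D = (z_j, z_{D-j})$, because by (38),

$$
f(z_j, z_{D-j}) = \sum_{A \subseteq D - j} h_A \prod_{l \in A} z_l \prod_{l \in D - A - j} (1 - z_l)
\;+\; z_j \cdot \sum_{A \subseteq D - j} (h_{A \cup j} - h_A) \prod_{l \in A} z_l \prod_{l \in D - A - j} (1 - z_l).
$$

That is, because $h_{A \cup j} - h_A \ge 0$ by (37) and all products are nonnegative,

$$
1 \ge z_j \ge z'_j \ge 0 \;\Longrightarrow\; f(z_j, z_{D-j}) \ge f(z'_j, z_{D-j}) \qquad (j \in D). \tag{39}
$$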
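The multilinear interpolation and its coordinate-wise monotonicity can be illustrated numerically. The sketch below evaluates $f$ as a sum over subsets $A$ of $D$ for a three-element set $D$ and hypothetical vertex values $h_A$ (equal to 1 when $A$ has at least two elements, which is monotone in $A$), then compares $f$ along each coordinate on a coarse grid. The function names and the choice of $h$ are assumptions made for illustration.

```python
from itertools import combinations, product

def subsets(D):
    """All subsets A of D, as frozensets."""
    return [frozenset(A) for r in range(len(D) + 1) for A in combinations(D, r)]

def f(z, h, D):
    """Sum over A of h[A] * prod_{l in A} z[l] * prod_{l in D-A} (1 - z[l])."""
    total = 0.0
    for A in subsets(D):
        term = h[A]
        for l in D:
            term *= z[l] if l in A else 1 - z[l]
        total += term
    return total

def monotone_on_grid(h, D, grid=(0.0, 0.25, 0.5, 0.75, 1.0)):
    """Check on a grid that raising any single coordinate never lowers f."""
    for point in product(grid, repeat=len(D)):
        z = dict(zip(D, point))
        for j in D:
            for higher in grid:
                if higher > z[j] and f({**z, j: higher}, h, D) < f(z, h, D) - 1e-12:
                    return False
    return True

D = [0, 1, 2]
# Hypothetical vertex values, monotone in A: larger A never gets a smaller value.
h = {A: 1.0 if len(A) >= 2 else 0.0 for A in subsets(D)}
# A non-monotone choice for contrast: only the empty set gets value 1.
h_bad = {A: 1.0 if len(A) == 0 else 0.0 for A in subsets(D)}

print(monotone_on_grid(h, D), monotone_on_grid(h_bad, D))
```

At each vertex of the cube the function returns the corresponding value $h_A$, and only the monotone choice of $h$ passes the grid check.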



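Using (31), let

$$
D_0 = D^i_0, \qquad D_1 = D^i_1, \qquad A_0 = A^i_0, \qquad A_1 = A^i_1, \tag{40}
$$

and define $z'_D$ and $z_D$ in $[0, 1]^D$ by

$$
\begin{aligned}
z'_j &= 1 - \varepsilon_0, & z_j &= \varepsilon_1 & &\text{for } j \in D_0,\\
z'_j &= 1 - \varepsilon_1, & z_j &= \varepsilon_0 & &\text{for } j \in D_1.
\end{aligned}
\tag{41}
$$

Then $z'_D > z_D$ in each component by (16). Using (39) inductively shows

$$
f(z'_D) \ge f(z_D). \tag{42}
$$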

The grand finale is to expand (38), using (31) and (40), to

$$
f(z_D) = \sum_{A_0 \subseteq D_0,\ A_1 \subseteq D_1} h_{A_0 \cup A_1} \prod_{l \in A_0} z_l \prod_{l \in D_0 - A_0} (1 - z_l) \prod_{l \in A_1} z_l \prod_{l \in D_1 - A_1} (1 - z_l)
$$

and to observe that by (41), (31), (32) for $l = i$, (36), (42), (25) and again (41) and (32)
for $x^l_D = x_D$,

$$
\sum_{A \subseteq D} p(y^A_D|x^i_D)\, d((y_S, y^A_D), i) = f(z'_D) \ge f(z_D) = \sum_{A \subseteq D} p(y^A_D|x_D)\, d((y_S, y^A_D), i). \tag{43}
$$

Multiplying this inequality by $p(y_S|x_S)$ on both sides and using (27) then gives (28) (with
$y_D$ written as $y^A_D$), which was to be shown.

□



We conclude with two small remarks: 



First, the equilibrium condition for the sender does not necessarily hold strictly; if all 
codewords have the same bit in one particular position, then that bit is ignored by the 
receiver and correspondingly can be altered by the sender in any codeword. 

Second, the preceding proof works also if each of the n times that the channel is used 
independently, different error probabilities apply, as long as these are common knowledge. 
We have not made this assumption to avoid further notational complications.